Automatic Labeling of RSS Articles Using Online Latent Dirichlet Allocation
Authors
Abstract
Similar resources
Automatic keyword extraction using Latent Dirichlet Allocation topic modeling: Similarity with golden standard and users' evaluation
Purpose: This study investigates the automatic keyword extraction from the table of contents of Persian e-books in the field of science using LDA topic modeling, evaluating their similarity with golden standard, and users' viewpoints of the model keywords. Methodology: This is a mixed text-mining research in which LDA topic modeling is used to extract keywords from the table of contents of sci...
Distributed Online Learning for Latent Dirichlet Allocation
A major obstacle in using Latent Dirichlet Allocation (LDA) is the amount of time it takes for inference, especially for a dataset that starts out large and expands quickly, such as a corpus of blog posts or online news articles. Recent developments in distributed inference algorithms for LDA, as well as minibatch-based online learning algorithms, have offered partial solutions for this problem. In th...
Topic Extraction and Bundling of Related Scientific Articles
Automatic classification of scientific articles based on common characteristics is an interesting problem with many applications in digital library and information retrieval systems. Properly organized articles can be useful for automatic generation of taxonomies in scientific writings, textual summarization, efficient information retrieval etc. Generating article bundles from a large number of...
Online Learning for Latent Dirichlet Allocation
We develop an online variational Bayes (VB) algorithm for Latent Dirichlet Allocation (LDA). Online LDA is based on online stochastic optimization with a natural gradient step, which we show converges to a local optimum of the VB objective function. It can handily analyze massive document collections, including those arriving in a stream. We study the performance of online LDA in several ways, ...
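The abstract above describes the online update only at a high level. As a rough illustration, the following minimal sketch shows one common form of such a stochastic step: a decaying step size and a weighted blend of the current topic-word parameters with an estimate from a single minibatch. The function names (`learning_rate`, `minibatch_estimate`, `online_update`), the parameters `tau0`, `kappa`, and `eta`, and the toy numbers are assumptions made for this example, and the per-document variational E-step that would produce the expected counts is omitted.

```python
def learning_rate(t, tau0=1.0, kappa=0.7):
    """Decaying step size rho_t = (tau0 + t) ** (-kappa).

    Assumed schedule: kappa in (0.5, 1] is the range usually required
    for stochastic-approximation convergence guarantees.
    """
    if not 0.5 < kappa <= 1.0:
        raise ValueError("kappa must lie in (0.5, 1]")
    return (tau0 + t) ** (-kappa)


def minibatch_estimate(expected_counts, eta, corpus_size, batch_size):
    """Topic-word parameters as if the whole corpus looked like this minibatch.

    expected_counts[k][w] is the expected number of times word w was assigned
    to topic k within the minibatch (the output of an E-step, not shown here).
    """
    scale = corpus_size / batch_size
    return [[eta + scale * c for c in row] for row in expected_counts]


def online_update(lam, lam_hat, rho):
    """Blend the current parameters with the minibatch estimate: (1 - rho) * lam + rho * lam_hat."""
    return [
        [(1.0 - rho) * old + rho * new for old, new in zip(row, row_hat)]
        for row, row_hat in zip(lam, lam_hat)
    ]


if __name__ == "__main__":
    lam = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]        # 2 topics, 3 vocabulary words
    counts = [[2.0, 0.0, 1.0], [0.0, 3.0, 0.0]]     # hypothetical E-step output
    lam_hat = minibatch_estimate(counts, eta=0.01, corpus_size=1000, batch_size=10)
    lam = online_update(lam, lam_hat, learning_rate(t=0))
    print(lam)
```

Because the step size shrinks as `t` grows, early minibatches move the parameters substantially while later ones only refine them, which is what lets this style of update process documents arriving in a stream.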
Classifying Scientific Publications Using Abstract Features
With the exponential increase in the number of documents available online, e.g., news articles, weblogs, scientific documents, effective and efficient classification methods are required in order to deliver the appropriate information to specific users or groups. The performance of document classifiers critically depends, among other things, on the choice of the feature representation. The comm...